---
layout: default
title: Accepted Papers
hide: true
navigation_weight: 11
---

<h1>AISTATS 2020 Accepted Papers</h1>
<style type="text/css">
  b { font-weight: bold; }
</style>
<ul style="font-size: 10pt">
<li><b>Linearly Convergent Frank-Wolfe without Line-Search</b><br />Fabian Pedregosa (Google)*; Geoffrey Negiar (UC Berkeley); Armin Askari (UC Berkeley); Martin Jaggi (EPFL)</li>
<li><b>Guarantees of Stochastic Greedy Algorithms for Non-monotone Submodular Maximization</b><br />Shinsaku Sakaue (NTT)*</li>
<li><b>On Maximization of Weakly Modular Functions: Guarantees of Multi-stage Algorithms, Tractability, and Hardness</b><br />Shinsaku Sakaue (NTT)*</li>
<li><b>Adaptive Trade-Offs in Off-Policy Learning</b><br />Mark Rowland (DeepMind)*; Will Dabney (DeepMind); Remi Munos (DeepMind)</li>
<li><b>Conditional Importance Sampling for Off-Policy Learning</b><br />Mark Rowland (DeepMind)*; Anna Harutyunyan (DeepMind); Hado van Hasselt (DeepMind); Diana Borsa (DeepMind); Tom Schaul (DeepMind); Remi Munos (DeepMind); Will Dabney (DeepMind)</li>
<li><b>Multiplicative Gaussian Particle Filter</b><br />Xuan Su (National University of Singapore)*; Wee Sun Lee (National University of Singapore); Zhen Zhang (University of Adelaide)</li>
<li><b>Stretching the Effectiveness of MLE from Accuracy to Bias for Pairwise Comparisons</b><br />Jingyan Wang (Carnegie Mellon University)*; Nihar Shah (CMU); R Ravi (CMU)</li>
<li><b>Fast and Accurate Ranking Regression</b><br />Ilkay Yildiz (Northeastern University)*; Jennifer Dy (Northeastern); Deniz Erdogmus (Northeastern University); Jayashree Kalpathy-Cramer (MGH/Harvard Medical School); Susan Ostmo (Oregon Health & Science University); J. Peter Campbell (Oregon Health & Science University); Michael F. Chiang (Oregon Health & Science University); Stratis Ioannidis (Northeastern University)</li>
<li><b>Tight Analysis of Privacy and Utility Tradeoff in Approximate Differential Privacy</b><br />Quan Geng (Facebook, Inc.)*; Wei Ding (Google); Ruiqi Guo (Google); Sanjiv Kumar (Google Research)</li>
<li><b>Long- and Short-Term Forecasting for Portfolio Selection with Transaction Costs</b><br />Guy Uziel (Technion)*; Ran El-Yaniv (Technion)</li>
<li><b>Nonparametric Sequential Prediction While Deep Learning the Kernel</b><br />Guy Uziel (Technion)*</li>
<li><b>Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation</b><br />Yuxuan Song (Shanghai Jiao Tong University)*; Ning Miao (ByteDance AI Lab); Hao Zhou (Bytedance); Lantao Yu (Stanford University); Mingxuan Wang (Bytedance); Lei Li (ByteDance AI Lab)</li>
<li><b>A Double Residual Compression Algorithm for Efficient Distributed Learning</b><br />Xiaorui Liu (Michigan State University)*; Yao Li (Michigan State University); Jiliang Tang (Michigan State University); Ming Yan (Michigan State University)</li>
<li><b>Asynchronous Gibbs Sampling</b><br />Alexander Terenin (Imperial College London)*; Daniel P Simpson (University of Toronto); David Draper (University of California, Santa Cruz)</li>
<li><b>Learning Fair Representations for Kernel Models</b><br />Zilong Tan (Carnegie Mellon University)*; Samuel Yeom (Carnegie Mellon University); Matt Fredrikson (Carnegie Mellon University); Ameet Talwalkar (CMU)</li>
<li><b>A Nonparametric Off-Policy Policy Gradient</b><br />Samuele Tosatto (TU Darmstadt)*; Joao Carvalho (University of Freiburg); Hany Abdulsamad (Technische Universität Darmstadt); Jan Peters (TU Darmstadt + Max Planck Institute for Intelligent Systems)</li>
<li><b>Non-Parametric Calibration for Classification</b><br />Jonathan Wenger (University of Tübingen)*; Hedvig Kjellström (KTH Royal Institute of Technology); Rudolph Triebel (German Aerospace Center (DLR))</li>
<li><b>Minimax Testing of Identity to a Reference Ergodic Markov Chain</b><br />Geoffrey Wolfer (Ben-Gurion University of the Negev)*; Aryeh Kontorovich (Ben-Gurion University of the Negev)</li>
<li><b>A Linear-time Independence Criterion Based on a Finite Basis Approximation</b><br />Longfei Yan (Victoria University of Wellington)*; W. Bastiaan Kleijn (Victoria University of Wellington); Thushara Abhayapala (The Australian National University)</li>
<li><b>Minimax Bounds for Structured Prediction Based on Factor Graphs</b><br />Kevin Bello (Purdue University)*; Asish Ghoshal (Purdue University); Jean Honorio (Purdue)</li>
<li><b>On the Convergence of SARAH and Beyond</b><br />Bingcong Li (University of Minnesota)*; Meng Ma (University of Minnesota); Georgios B. Giannakis (University of Minnesota)</li>
<li><b>Uncertainty in Neural Networks: Approximately Bayesian Ensembling</b><br />Tim D Pearce (University of Cambridge)*; Felix Leibfried (PROWLER.io); Alexandra Brintrup (University of Cambridge)</li>
<li><b>LIBRE: Learning Interpretable Boolean Rule Ensembles</b><br />Graziano Mita (EURECOM)*; Paolo Papotti (Eurecom); Maurizio Filippone (EURECOM); Pietro Michiardi (EURECOM)</li>
<li><b>Marginal Densities, Factor Graph Duality, and High-Temperature Series Expansions</b><br />Mehdi Molkaraie (University of Toronto)*</li>
<li><b>Neighborhood Growth Determines Geometric Priors for Relational Representation Learning</b><br />Melanie Weber (Princeton University)*</li>
<li><b>Fair Decisions Despite Imperfect Predictions</b><br />Niki Kilbertus (MPI Tübingen & Cambridge)*; Manuel Gomez Rodriguez (MPI-SWS); Bernhard Schölkopf (MPI for Intelligent Systems, Tübingen); Krikamol Muandet (Max Planck Institute for Intelligent Systems); Isabel Valera (MPI for Intelligent Systems, Tübingen)</li>
<li><b>A Characterization of Mean Squared Error for Estimator with Bagging</b><br />Martin Mihelich (Open Pricer & Ecole Normale Supérieure); Charles Dognin (Verisk Analytics)*; Yan Shu (Walnut Algorithms); Michael Blot (Walnut Algorithms)</li>
<li><b>Uncertainty Quantification for Sparse Deep Learning</b><br />Yuexi Wang (University of Chicago)*; Veronika Rockova (University of Chicago)</li>
<li><b>Minimizing Dynamic Regret and Adaptive Regret Simultaneously</b><br />Lijun Zhang (Nanjing University)*; Shiyin Lu (Nanjing University); Tianbao Yang (University of Iowa)</li>
<li><b>A Stein Goodness-of-fit Test for Directional Distributions</b><br />Wenkai Xu (Gatsby Unit, UCL)*; Takeru Matsuda (University of Tokyo, RIKEN CBS)</li>
<li><b>Unsupervised Neural Universal Denoiser for Finite-Input General-Output Noisy Channel</b><br />Taeeon Park (Sungkyunkwan University); Taesup Moon (Sungkyunkwan University)*</li>
<li><b>Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data</b><br />Måns Magnusson (Aalto University)*; Aki Vehtari (Aalto University); Johan Jonasson (Chalmers University of Technology); Michael R Andersen (Aalto University)</li>
<li><b>Robust Importance Weighting for Covariate Shift</b><br />Fengpei Li (Columbia University)*; Henry Lam (Columbia University); Siddharth Prusty (Columbia University, NYC)</li>
<li><b>Adaptive Online Kernel Sampling for Vertex Classification</b><br />Peng Yang (Baidu)*; Ping Li (Baidu)</li>
<li><b>A Hybrid Stochastic Policy Gradient Algorithm for Reinforcement Learning</b><br />Nhan H Pham (University of North Carolina at Chapel Hill)*; Lam M Nguyen (IBM Research, Thomas J. Watson Research Center); Dzung Phan (IBM Research, T. J. Watson Research Center); Phuong Ha Nguyen (UConn); Marten van Dijk (University of Connecticut); Quoc Tran-Dinh (The University of North Carolina at Chapel Hill)</li>
<li><b>Stopping criterion for active learning based on deterministic generalization bounds</b><br />Hideaki Ishibashi (Kyushu Institute of Technology)*; Hideitsu Hino (The Institute of Statistical Mathematics/RIKEN AIP)</li>
<li><b>Ivy: Instrumental Variable Synthesis for Causal Inference</b><br />Zhaobin Kuang (Stanford University)*; Frederic Sala (Stanford); Nimit S Sohoni (Stanford University); Sen Wu (Stanford University); Aldo Córdova-Palomera (Stanford University); Jared Dunnmon (Stanford University); James R Priest (Stanford University); Christopher Re (Stanford University)</li>
<li><b>High Dimensional Robust Sparse Regression</b><br />Liu Liu (University of Texas at Austin)*; Yanyao Shen (UT Austin); Tianyang Li (UT Austin); Constantine Caramanis (University of Texas)</li>
<li><b>Nested-Wasserstein Self-Imitation Learning for Sequence Generation</b><br />Ruiyi Zhang (Duke University)*; Changyou Chen (University at Buffalo); Zhe Gan (Microsoft); Zheng Wen (DeepMind); Wenlin Wang (Duke University); Lawrence Carin (Duke University)</li>
<li><b>Greed Meets Sparsity: Understanding and Improving Greedy Coordinate Descent for Sparse Optimization</b><br />Huang Fang (University of British Columbia)*; Zhenan Fan (The University of British Columbia); Yifan Sun (INRIA-Paris); Michael P Friedlander (University of British Columbia)</li>
<li><b>Recommendation on a Budget: Column Space Recovery from Partially Observed Entries with Random or Active Sampling</b><br />Carolyn Kim (Stanford University)*; Mohsen Bayati (Stanford University)</li>
<li><b>Fast Noise Removal for k-Means Clustering</b><br />Sungjin Im (University of California at Merced); Mahshid Montazer Qaem (UC Merced); Benjamin Moseley (Carnegie Mellon University); Xiaorui Sun (Microsoft Research); Rudy Zhou (Carnegie Mellon University)*</li>
<li><b>Sketching Transformed Matrices with Applications to Natural Language Processing</b><br />Yingyu Liang (University of Wisconsin Madison); Zhao Song (IAS/Princeton); Mengdi Wang (Princeton University); Lin Yang (UCLA); Xin Yang (University of Washington)*</li>
<li><b>Unconditional Coresets for Regularized Loss Minimization</b><br />Alireza Samadian (University of Pittsburgh)*; Kirk Pruhs (University of Pittsburgh); Benjamin Moseley (Carnegie Mellon University); Sungjin Im (UC Merced); Ryan Curtin (RelationalAI)</li>
<li><b>ASAP: Architecture Search, Anneal and Prune</b><br />Asaf Noy (Alibaba)*; Niv Nayman (Alibaba Group); Tal Ridnik (Alibaba); Nadav Zamir (Alibaba); Sivan Doveh (Tel Aviv University); Itamar Friedman (Alibaba); Raja Giryes (Tel Aviv University); Lihi Zelnik (Alibaba)</li>
<li><b>Understanding Generalization in Deep Learning via Tensor Methods</b><br />Jingling Li (UMD)*; Yanchao Sun (University of Maryland, College Park); Jiahao Su (UMD); Taiji Suzuki (The University of Tokyo / RIKEN); Furong Huang (University of Maryland)</li>
<li><b>Accelerating Gradient Boosting Machines</b><br />Haihao Lu (MIT)*; Sai Praneeth Karimireddy (EPFL); Natalia Ponomareva (Google Research); Vahab Mirrokni (Google)</li>
<li><b>Online Binary Space Partitioning Forests</b><br />Xuhui Fan (University of New South Wales)*; Bin Li (Fudan University); Scott A. Sisson (University of New South Wales, Sydney)</li>
<li><b>Sparse Hilbert-Schmidt Independence Criterion Regression</b><br />Benjamin Poignard (Osaka University / RIKEN AIP)*; Makoto Yamada (RIKEN AIP / Kyoto University)</li>
<li><b>Sharp Thresholds of the Information Cascade Fragility Under a Mismatched Model</b><br />Wasim Huleihel (Tel-Aviv University)*; Ofer Shayevitz (Tel Aviv University)</li>
<li><b>Optimal sampling in unbiased active learning</b><br />Henrik Imberg (Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg)*; Johan Jonasson (Chalmers University of Technology); Marina Axelson-Fisk (Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg)</li>
<li><b>The Area of the Convex Hull of Sampled Curves: a Robust Functional Statistical Depth measure</b><br />Guillaume Staerman (Télécom Paris)*; Pavlo Mozharovskyi (Télécom Paris); Stéphan Clémençon (Télécom Paris)</li>
<li><b>Diameter-based Interactive Structure Discovery</b><br />Christopher Tosh (Columbia University)*; Daniel Hsu (Columbia University)</li>
<li><b>Utility/Privacy Trade-off through the lens of Optimal Transport</b><br />Etienne Boursier (ENS Paris Saclay)*; Vianney Perchet (ENSAE & Criteo AI Lab)</li>
<li><b>A Lyapunov analysis for accelerated gradient methods: from deterministic to stochastic case</b><br />Maxime Laborde (McGill University)*; Adam Oberman (McGill University)</li>
<li><b>Interpretable Deep Gaussian Processes with Moments</b><br />Chi-Ken Lu (Rutgers University Newark)*; Scott Cheng-Hsin Yang (Rutgers University Newark); Xiaoran Hao (Rutgers University Newark); Patrick Shafto (Rutgers University-Newark)</li>
<li><b>Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions</b><br />Lars Buesing (DeepMind)*; Nicolas Heess (DeepMind); Theophane Weber (DeepMind)</li>
<li><b>Accelerated Bayesian Optimisation through Weight-Prior Tuning</b><br />Alistair Shilton (Deakin University)*; Sunil Gupta (Deakin University, Australia); Santu Rana (Deakin University, Australia); Pratibha Vellanki (Deakin University); Cheng Li (Deakin University); Svetha Venkatesh (Deakin University); Laurence Park (Western Sydney University); Alessandra Sutti; David Rubin; Thomas Dorin; Alireza Vahid; Murray Height; Teo Slezak</li>
<li><b>Variance Reduction for Evolution Strategies via Structured Control Variates</b><br />Yunhao Tang (Columbia University)*; Krzysztof Choromanski (Google Brain Robotics); Alp Kucukelbir (Columbia University)</li>
<li><b>Optimization of Graph Total Variation via Active-Set-based Combinatorial Reconditioning</b><br />Zhenzhang Ye (TU Munich)*; Thomas Möllenhoff (Technical University of Munich); Tao Wu (TU Munich); Daniel Cremers (TU Munich)</li>
<li><b>Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization</b><br />Kenji Kawaguchi (MIT); Haihao Lu (MIT)*</li>
<li><b>A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent</b><br />Eduard Gorbunov (Moscow Institute of Physics and Technology)*; Filip Hanzely (KAUST); Peter Richtarik (KAUST)</li>
<li><b>Entropy Weighted Power k-Means Clustering</b><br />Saptarshi Chakraborty (Indian Statistical Institute); Debolina Paul (Indian Statistical Institute); Swagatam Das (Indian Statistical Institute); Jason Q Xu (Duke University)*</li>
<li><b>Identifying and Correcting Label Bias in Machine Learning</b><br />Heinrich Jiang (Google Research)*; Ofir Nachum (Google)</li>
<li><b>AsyncQVI: Asynchronous-Parallel Q-Value Iteration for Discounted Markov Decision Processes with Near-Optimal Sample Complexity</b><br />Yibo Zeng (Columbia University)*; Fei Feng (University of California, Los Angeles); Wotao Yin (University of California, Los Angeles)</li>
<li><b>Active Community Detection with Maximal Expected Model Change</b><br />Dan Kushnir (Nokia Bell Labs)*; Benjamin P Mirabelli (Princeton University)</li>
<li><b>RCD: Repetitive causal discovery of linear non-Gaussian acyclic models with latent confounders</b><br />Takashi Nicholas Maeda (RIKEN Center for Advanced Intelligence Project)*; Shohei Shimizu (Shiga University)</li>
<li><b>A Simple Approach for Non-stationary Linear Bandits</b><br />Peng Zhao (Nanjing University)*; Lijun Zhang (Nanjing University); Yuan Jiang (Nanjing University); Zhi-Hua Zhou (Nanjing University)</li>
<li><b>Distributionally Robust Formulation and Model Selection for the Graphical Lasso</b><br />Pedro Cisneros (University of California, Santa Barbara)*; Alexander Petersen (University of California, Santa Barbara); Sang-Yun Oh (University of California, Santa Barbara)</li>
<li><b>Efficient Spectrum-Revealing CUR Matrix Decomposition</b><br />Cheng Chen (Shanghai Jiao Tong University)*; Ming Gu (University of California, Berkeley); Zhihua Zhang (Peking University); Weinan Zhang (Shanghai Jiao Tong University); Yong Yu (Shanghai Jiao Tong University)</li>
<li><b>Graph DNA: Deep Neighborhood Aware Graph Encoding for Collaborative Filtering</b><br />Liwei Wu (University of California, Davis)*; Hsiang-Fu Yu (Amazon); Nikhil Rao (Amazon); James Sharpnack (University of California, Davis); Cho-Jui Hsieh (UCLA)</li>
<li><b>Characterization of Overlap in Observational Studies</b><br />Michael Oberst (MIT); Fredrik D Johansson (Chalmers University of Technology)*; Dennis Wei (IBM Research); Tian Gao (IBM Research); Gabriel Brat (BIDMC Harvard Medical School); David Sontag (MIT); Kush R Varshney (IBM Research)</li>
<li><b>Modular Block-diagonal Curvature Approximations for Feedforward Architectures</b><br />Felix Dangel (University of Tuebingen)*; Stefan Harmeling (Heinrich Heine University Düsseldorf); Philipp Hennig (University of Tübingen and MPI for Intelligent Systems Tübingen)</li>
<li><b>A Unified Statistically Efficient Estimation Framework for Unnormalized Models</b><br />Masatoshi Uehara (Harvard University)*; Takafumi Kanamori (Tokyo Institute of Technology/RIKEN AIP); Takashi Takenouchi (Future University Hakodate/RIKEN Center for Advanced Intelligence Project); Takeru Matsuda (University of Tokyo, RIKEN CBS)</li>
<li><b>More Powerful Selective Kernel Tests for Feature Selection</b><br />Jen Ning Lim (University College London)*; Makoto Yamada (RIKEN AIP / Kyoto University); Wittawat Jitkrittum (Max Planck Institute for Intelligent Systems); Yoshikazu Terada (Osaka University / RIKEN); Shigeyuki Matsui (Nagoya University); Hidetoshi Shimodaira (Kyoto University / RIKEN AIP)</li>
<li><b>Imputation estimators for unnormalized models with missing data</b><br />Masatoshi Uehara (Harvard University)*; Takeru Matsuda (University of Tokyo, RIKEN CBS); Jae Kwang Kim (Iowa State University)</li>
<li><b>Wasserstein Style Transfer</b><br />Youssef Mroueh (IBM Research)*</li>
<li><b>Elimination of All Bad Local Minima in Deep Learning</b><br />Kenji Kawaguchi (MIT)*; Leslie Kaelbling (MIT)</li>
<li><b>Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs</b><br />Valentina Zantedeschi (Jean Monnet University)*; Aurélien Bellet (INRIA); Marc Tommasi (Lille University)</li>
<li><b>Formal Limitations on the Measurement of Mutual Information</b><br />David McAllester (TTI Chicago); Karl Stratos (Rutgers University)*</li>
<li><b>Scalable Feature Selection for (Multitask) Gradient Boosted Trees</b><br />Cuize Han (Amazon)*; Nikhil Rao (Amazon); Daria Sorokina (Amazon); Karthik Subbian (Amazon)</li>
<li><b>Model-Agnostic Counterfactual Explanations for Consequential Decisions</b><br />Amir-Hossein Karimi (MPI for Intelligent Systems, Tübingen)*; Gilles Barthe (MPI-SP and IMDEA Software Institute); Borja Balle (Amazon); Isabel Valera (MPI for Intelligent Systems, Tübingen)</li>
<li><b>Obfuscation via Information Density Estimation</b><br />Hsiang Hsu (Harvard University)*; Shahab Asoodeh (Harvard); Flavio Calmon (Harvard University)</li>
<li><b>Linear Dynamics: Clustering without identification</b><br />Chloe Hsu (University of California, Berkeley)*; Michaela Hardt (Amazon); Moritz Hardt (University of California, Berkeley)</li>
<li><b>Low-rank regularization and solution uniqueness in over-parameterized matrix sensing</b><br />Kelly L Geyer (Boston University); Anastasios Kyrillidis (Rice University)*; Amir Kalev (University of Maryland)</li>
<li><b>Robustness for Non-Parametric Classification: A Generic Attack and Defense</b><br />Yao-Yuan Yang (UCSD); Cyrus Rashtchian (UCSD); Yizhen Wang (UCSD); Kamalika Chaudhuri (University of California, San Diego)*</li>
<li><b>Contextual Online False Discovery Rate Control</b><br />Shiyun Chen (Amazon)*; Shiva Kasiviswanathan (Amazon AWS AI)</li>
<li><b>Sequential no-Substitution k-Median-Clustering</b><br />Tom Hess (Ben-Gurion University of the Negev)*; Sivan Sabato (Ben-Gurion University of the Negev)</li>
<li><b>Robust Learning from Discriminative Feature Feedback</b><br />Sanjoy Dasgupta (UCSD); Sivan Sabato (Ben-Gurion University of the Negev)*</li>
<li><b>Hermitian matrices for clustering directed graphs: insights and applications</b><br />Mihai Cucuringu (University of Oxford and The Alan Turing Institute); Huan Li (Fudan University); He Sun (School of Informatics, The University of Edinburgh); Luca Zanetti (University of Cambridge)*</li>
<li><b>Kernel Conditional Density Operators</b><br />Ingmar Schuster (Zalando Research)*; Mattes Mollenhauer (FU Berlin); Stefan Klus (Freie Universität Berlin); Krikamol Muandet (Max Planck Institute for Intelligent Systems)</li>
<li><b>Learning Overlapping Representations for the Estimation of Individualized Treatment Effects</b><br />Yao Zhang (University of Cambridge)*; Alexis Bellot (University of Cambridge); Mihaela van der Schaar (University of Cambridge)</li>
<li><b>Additive Tree-Structured Covariance Function for Conditional Parameter Spaces in Bayesian Optimization</b><br />Xingchen Ma (KU Leuven)*; Matthew Blaschko (KU Leuven)</li>
<li><b>Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms</b><br />Ping Ma (University of Georgia)*; Xinlian Zhang (University of Georgia); Xin Xing (University of Georgia); Jingyi Ma (Central University of Finance & Economics); Michael Mahoney (University of California, Berkeley)</li>
<li><b>The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions</b><br />Feras Saad (Massachusetts Institute of Technology)*; Cameron Freer (Massachusetts Institute of Technology); Martin Rinard (MIT); Vikash Mansinghka (Massachusetts Institute of Technology)</li>
<li><b>A Fast Anderson-Chebyshev Acceleration for Nonlinear Optimization</b><br />Zhize Li (King Abdullah University of Science and Technology)*; Jian Li (Tsinghua University)</li>
<li><b>Black Box Submodular Maximization: Discrete and Continuous Settings</b><br />Lin Chen (Yale University)*; Mingrui Zhang (Yale University); Hamed Hassani (University of Pennsylvania); Amin Karbasi (Yale)</li>
<li><b>Corruption-Tolerant Gaussian Process Bandit Optimization</b><br />Ilija Bogunovic (ETH Zurich)*; Andreas Krause (ETH Zürich); Jonathan Scarlett (National University of Singapore)</li>
<li><b>On the Convergence Theory of Gradient-Based Model-Agnostic Meta-Learning Algorithms</b><br />Alireza Fallah (MIT)*; Aryan Mokhtari (UT Austin); Asuman Ozdaglar (MIT)</li>
<li><b>Alternating Minimization Converges Super-Linearly for Mixed Linear Regression</b><br />Avishek Ghosh (University of California, Berkeley)*; Ramchandran Kannan (Department of Electrical Engineering and Computer Science, University of California, Berkeley)</li>
<li><b>Learning Gaussian Graphical Models via Multiplicative Weights</b><br />Anamay Chaturvedi (Northeastern University); Jonathan Scarlett (National University of Singapore)*</li>
<li><b>Mitigating Overfitting in Supervised Classification from Two Unlabeled Datasets: A Consistent Risk Correction Approach</b><br />Nan Lu (The University of Tokyo)*; Tianyi Zhang (The University of Tokyo); Gang Niu (RIKEN); Masashi Sugiyama (RIKEN/The University of Tokyo)</li>
<li><b>Infinitely deep neural networks as diffusion processes</b><br />Stefano Peluchetti (Cogent Labs)*; Stefano Favaro (University of Torino and Collegio Carlo Alberto)</li>
<li><b>Stable behaviour of infinitely wide deep neural networks</b><br />Stefano Peluchetti (Cogent Labs)*; Stefano Favaro (University of Torino and Collegio Carlo Alberto); Sandra Fortini (Bocconi University)</li>
<li><b>Neural Topic Model with Attention for Supervised Learning</b><br />Xinyi Wang (Hong Kong University of Science and Technology); Yi Yang (Hong Kong University of Science and Technology)*</li>
<li><b>Causal Mosaic: Cause-Effect Inference via Nonlinear ICA and Ensemble Method</b><br />Pengzhou Wu (The Graduate University for Advanced Studies)*; Kenji Fukumizu (The Institute of Statistical Mathematics)</li>
<li><b>Stochastic Bandits with Delay-Dependent Payoffs</b><br />Leonardo Cella (University of Milan)*; Nicolò Cesa-Bianchi (University of Milan)</li>
<li><b>Risk Bounds for Learning Multiple Components with Permutation-Invariant Losses</b><br />Fabien Lauer (University of Lorraine)*</li>
<li><b>Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration</b><br />Matteo Papini (Politecnico di Milano)*; Andrea Battistello (Politecnico di Milano); Marcello Restelli (Politecnico di Milano)</li>
<li><b>Independent Subspace Analysis for Unsupervised Learning of Disentangled Representations</b><br />Jan Stuehmer (Microsoft Research)*; Richard Turner (Cambridge); Sebastian Nowozin (Google Research Berlin)</li>
<li><b>A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players</b><br />Abbas Mehrabian (McGill University)*; Etienne Boursier (ENS Paris-Saclay); Emilie Kaufmann; Vianney Perchet (ENSAE & Criteo AI Lab)</li>
<li><b>Regularity as Regularization: Smooth and Strongly Convex Brenier Potentials in Optimal Transport</b><br />François-Pierre Paty (ENSAE/CREST)*; Alexandre d'Aspremont (CNRS - Ecole Normale Supérieure); Marco Cuturi (Google and CREST/ENSAE)</li>
<li><b>On Generalization Bounds of a Family of Recurrent Neural Networks</b><br />Minshuo Chen (Georgia Tech)*; Xingguo Li (Princeton University); Tuo Zhao (Georgia Tech)</li>
<li><b>Simulator Calibration under Covariate Shift with Kernels</b><br />Keiichi Kisamori (AIST)*; Motonobu Kanagawa (EURECOM); Keisuke Yamazaki (National Institute of Advanced Industrial Science and Technology)</li>
<li><b>Convergence Rates of Gradient Descent and MM Algorithms for Bradley-Terry Models</b><br />Milan Vojnovic*; Se-Young Yun (KAIST); Kaifang Zhou (London School of Economics and Political Science)</li>
<li><b>A Locally Adaptive Bayesian Cubature Method</b><br />Matthew A Fisher (Newcastle University)*; Chris Oates (Newcastle University); Catherine Powell (University of Manchester); Aretha Teckentrup (University of Edinburgh)</li>
<li><b>Fast and Bayes-consistent nearest neighbors</b><br />Klim Efremenko (Ben-Gurion University); Aryeh Kontorovich (Ben-Gurion University of the Negev)*; Moshe Noivirt (Ben-Gurion University)</li>
<li><b>Explaining the Explainer: A First Theoretical Analysis of LIME</b><br />Damien Garreau (Max Planck Institute)*; Ulrike von Luxburg (U Tübingen)</li>
<li><b>A Continuous-time Perspective for Modeling Acceleration in Riemannian Optimization</b><br />Foivos Alimisis (ETH Zurich)*; Antonio Orvieto (ETH Zurich); Gary Becigneul (MIT); Aurelien Lucchi (ETH Zurich)</li>
<li><b>Deep Active Learning: Unified and Principled Method for Query and Training</b><br />Changjian Shui (Université Laval)*; Fan Zhou (Laval University); Christian Gagné (Université Laval); Boyu Wang (University of Western Ontario)</li>
<li><b>Sparse and Low-rank Tensor Estimation via Cubic Sketchings</b><br />Botao Hao (Purdue University)*; Anru Zhang (UW Madison); Guang Cheng (Purdue University)</li>
<li><b>A nonasymptotic law of iterated logarithm for general M-estimators</b><br />Arnak Dalalyan (ENSAE ParisTech - Centre for Research in Economics and Statistics)*; Nicolas Schreuder (CREST); Victor-Emmanuel Brunel (ENSAE ParisTech)</li>
<li><b>Robust Stackelberg buyers in repeated auctions</b><br />Thomas Nedelec (Criteo / ENS Paris Saclay)*; Clement Calauzenes (Criteo); Vianney Perchet (ENSAE & Criteo AI Lab); Noureddine El Karoui (UC Berkeley and Criteo AI Lab)</li>
<li><b>Radial Bayesian Neural Networks: Beyond Discrete Support In Large-Scale Bayesian Deep Learning</b><br />Sebastian Farquhar (University of Oxford)*; Michael A. Osborne (University of Oxford); Yarin Gal (University of Oxford)</li>
<li><b>Practical Nonisotropic Monte Carlo Sampling in High Dimensions via Determinantal Point Processes</b><br />Krzysztof Choromanski (Google)*; Aldo Pacchiano (UC Berkeley); Jack Parker-Holder (University of Oxford); Yunhao Tang (Columbia University)</li>
<li><b>Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation</b><br />Si Yi Meng (University of British Columbia)*; Sharan Vaswani (Mila, Université de Montréal); Issam Hadj Laradji (University of British Columbia (UBC)); Mark Schmidt (University of British Columbia); Simon Lacoste-Julien (Mila, Université de Montréal)</li>
<li><b>Two-sample Testing Using Deep Learning</b><br />Matthias Kirchler (Hasso Plattner Institute)*; Shahryar Khorasani (Hasso Plattner Institute for Digital Engineering); Marius Kloft (University of Southern California); Christoph Lippert (Hasso Plattner Institute for Digital Engineering)</li>
<li><b>RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimization</b><br />Prathamesh Mayekar (Indian Institute of Science)*; Himanshu Tyagi (IISC)</li>
<li><b>Rep the Set: Neural Networks for Learning Set Representations</b><br />Konstantinos Skianis (École Polytechnique)*; Giannis Nikolentzos (Ecole Polytechnique); Stratis Limnios (LIX); Michalis Vazirgiannis (École Polytechnique)</li>
<li><b>A Multiclass Classification Approach to Label Ranking</b><br />Robin Vogel (Télécom ParisTech)*; Stéphan Clémençon (Télécom ParisTech)</li>
<li><b>Conservative Exploration in Reinforcement Learning</b><br />Evrard Garcelon*; Mohammad Ghavamzadeh (Facebook); Alessandro Lazaric (FAIR); Matteo Pirotta (Facebook AI Research)</li>
<li><b>A principled approach for generating adversarial images under non-smooth dissimilarity metrics</b><br />Aram-Alexandre Pooladian (McGill University)*; Chris Finlay (McGill University); Tim Hoheisel (McGill University); Adam Oberman (McGill University)</li>
<li><b>Regularization via Structural Label Smoothing</b><br />Weizhi Li (Arizona State University)*; Gautam Dasarathy (Arizona State University); Visar Berisha (Arizona State University)</li>
<li><b>Communication-Efficient Asynchronous Stochastic Frank-Wolfe over Nuclear-norm Balls</b><br />Jiacheng Zhuo (University of Texas at Austin)*; Qi Lei (UT Austin); Alex Dimakis (UT Austin); Constantine Caramanis (University of Texas)</li>
<li><b>Linear Convergence of Adaptive Stochastic Gradient Descent</b><br />Yuege Xie (University of Texas at Austin)*; Xiaoxia Wu (University of Texas at Austin); Rachel Ward (University of Texas)</li>
<li><b>Contextual Combinatorial Volatile Multi-armed Bandit with Adaptive Discretization</b><br />Andi Nika (Bilkent University)*; Sepehr Elahi (Bilkent University); Cem Tekin (Bilkent University)</li>
<li><b>A Unified Analysis of Extra-gradient and Optimistic Gradient Methods for Saddle Point Problems: Proximal Point Approach</b><br />Aryan Mokhtari (UT Austin); Asuman Ozdaglar (MIT); Sarath Pattathil (Massachusetts Institute of Technology)*</li>
<li><b>Bandit Convex Optimization in Non-stationary Environments</b><br />Peng Zhao (Nanjing University)*; Guanghui Wang (Nanjing University); Lijun Zhang (Nanjing University); Zhi-Hua Zhou (Nanjing University)</li>
<li><b>Decentralized Multi-player Multi-armed Bandits with No Collision Information</b><br />Chengshuai Shi (University of Virginia); Wei Xiong (USTC); Cong Shen (University of Virginia)*; Jing Yang (Penn State University)</li>
<li><b>Bayesian Image Classification with Deep Convolutional Gaussian Processes</b><br />Vincent Dutordoir (PROWLER.io)*; Mark van der Wilk (PROWLER.io); Artem Artemev (PROWLER.io); James Hensman (PROWLER.io)</li>
<li><b>Optimizing Millions of Hyperparameters by Implicit Differentiation</b><br />Jonathan P Lorraine (University of Toronto)*; Paul Vicol (University of Toronto); David Duvenaud (University of Toronto)</li>
<li><b>A Topology Layer for Machine Learning</b><br />Rickard Brüel Gabrielsson (Stanford University)*; Bradley J. Nelson (Stanford University); Anjan Dwaraknath (Stanford University); Primoz Skraba (Queen Mary University of London)</li>
<li><b>Differentiable Feature Selection by Discrete Relaxation</b><br />Rishit Sheth (Microsoft Research)*; Nicolo Fusi (Microsoft Research)</li>
<li><b>Private Protocols for U-Statistics in the Local Model and Beyond</b><br />James Bell (Alan Turing Institute); Aurélien Bellet (INRIA)*; Adria Gascon (The Alan Turing Institute); Tejas Kulkarni (Warwick)</li>
<li><b>Automatic Differentiation of Some First-Order Methods in Parametric Optimization</b><br />Sheheryar Mehmood (Saarland University)*; Peter Ochs (Saarland University)</li>
<li><b>DYNOTEARS: Structure Learning from Time-Series Data</b><br />Roxana Pamfil (QuantumBlack)*; Nisara Sriwattanaworachai (QuantumBlack); Shaan Desai (QuantumBlack); Philip Pilgerstorfer (QuantumBlack); Konstantinos Georgatzis (QuantumBlack); Paul Beaumont (QuantumBlack); Bryon Aragam (University of Chicago)</li>
<li><b>Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces</b><br />David Alvarez-Melis (Microsoft)*; Youssef Mroueh (IBM Research); Tommi Jaakkola (MIT)</li>
<li><b>Competing Bandits in Matching Markets</b><br />Lydia T. Liu (University of California, Berk)*; Horia Mania (UC Berkeley); Michael Jordan (UC Berkeley)</li>
<li><b>Revisiting the Landscape of Matrix Factorization</b><br />Hossein Valavi (Princeton University)*; Sulin Liu (Princeton University); Peter Ramadge (Princeton)</li>
<li><b>Value Preserving State-Action Abstractions</b><br />David Abel (Brown University)*; Nate Umbanhowar (Brown University); Khimya Khetarpal (McGill University); Dilip Arumugam (Stanford University); Doina Precup (McGill University); Michael L. Littman (Brown University)</li>
<li><b>GP-VAE: Deep Probabilistic Time Series Imputation</b><br />Vincent Fortuin (ETH Zürich)*; Dmitry A Baranchuk (MSU / Yandex); Gunnar Raetsch (ETH Zurich); Stephan M Mandt (University of California, Irvine)</li>
<li><b>Communication-Efficient Distributed Optimization in Networks with Gradient Tracking and Variance Reduction</b><br />Boyue Li (Carnegie Mellon University)*; Shicong Cen (CMU); Yuxin Chen (Princeton University); Yuejie Chi (CMU)</li>
<li><b>Optimized Score Transformation for Fair Classification</b><br />Dennis Wei (IBM Research)*; Karthikeyan Natesan Ramamurthy (IBM Research); Flavio Calmon (Harvard University)</li>
<li><b>Variational Autoencoders for Sparse and Overdispersed Discrete Data</b><br />He Zhao (Monash University)*; Piyush Rai (IIT Kanpur); Lan Du (Monash University); Wray Buntine (Monash University); Dinh Phung (Monash University); Mingyuan Zhou (University of Texas at Austin)</li>
<li><b>Spatio-temporal alignments: Optimal transport through space and time</b><br />Hicham Janati (Inria / ENSAE-CREST)*; Marco Cuturi (Google and CREST/ENSAE); Alexandre Gramfort (Inria)</li>
<li><b>Accelerating Smooth Games by Manipulating Spectral Shapes</b><br />Waïss Azizian (Mila, University of Montreal, Ecole Normale Supérieure de Paris)*; Damien Scieur (Samsung SAIL Montreal); Ioannis Mitliagkas (Mila & University of Montreal); Simon Lacoste-Julien (Mila, Université de Montréal); Gauthier Gidel (Mila, Université de Montréal)</li>
<li><b>Langevin Monte Carlo without smoothness</b><br />Niladri S Chatterji (UC Berkeley)*; Jelena Diakonikolas (University of California, Berkeley); Michael Jordan (UC Berkeley); Peter Bartlett ()</li>
<li><b>EM Converges for a Mixture of Many Linear Regressions</b><br />Jeongyeol Kwon (The University of Texas at Austin)*; Constantine Caramanis (University of Texas)</li>
<li><b>Locally Accelerated Conditional Gradients</b><br />Jelena Diakonikolas (University of California, Berkeley)*; Alejandro Carderera (Georgia Institute of Technology); Sebastian Pokutta (ZIB)</li>
<li><b>Coping With Simulators That Don’t Always Return</b><br />Andrew Warrington (University of Oxford)*; Frank Wood (University of British Columbia); Saeid Naderiparizi (University of British Columbia)</li>
<li><b>Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information</b><br />Esther Rolf (UC Berkeley)*; Michael Jordan (UC Berkeley); Benjamin Recht (UC Berkeley)</li>
<li><b>Equalized odds postprocessing under imperfect group information</b><br />Pranjal Awasthi (Rutgers University); Matthäus Kleindessner (University of Washington)*; Jamie Morgenstern (Georgia Tech)</li>
<li><b>The True Sample Complexity of Identifying Good Arms</b><br />Julian Katz-Samuels (University of Michigan)*; Kevin Jamieson (U Washington)</li>
<li><b>Validated Variational Inference via Practical Posterior Error Bounds</b><br />Jonathan H Huggins (Boston University)*; Mikolaj J Kasprzak (University of Luxembourg); Trevor Campbell (UBC); Tamara Broderick (MIT)</li>
<li><b>A Rule for Gradient Estimator Selection, with an Application to Variational Inference</b><br />Tomas Geffner (UMass Amherst)*; Justin Domke (UMass Amherst)</li>
<li><b>Naive Feature Selection: Sparsity in Naive Bayes</b><br />Armin Askari (UC Berkeley)*; Alexandre d'Aspremont (CNRS - Ecole Normale Supérieure); Laurent El Ghaoui (UC Berkeley)</li>
<li><b>Fixed-confidence guarantees for Bayesian best-arm identification</b><br />Xuedong Shang (Inria)*; Rianne de Heide (CWI / Leiden University); Pierre Menard (Inria); Emilie Kaufmann (); Michal Valko (DeepMind)</li>
<li><b>Learning Hierarchical Interactions at Scale: A Convex Optimization Approach</b><br />Hussein Hazimeh (Massachusetts Institute of Technology)*; Rahul Mazumder (Massachusetts Institute of Technology)</li>
<li><b>OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits</b><br />Niladri S Chatterji (UC Berkeley)*; Vidya Muthukumar (UC Berkeley); Peter Bartlett ()</li>
<li><b>Optimization Methods for Interpretable Differentiable Decision Trees Applied to Reinforcement Learning</b><br />Andrew Silva (Georgia Institute of Technology)*; Matthew Gombolay (Georgia Institute of Technology); Taylor W Killian (University of Toronto); Ivan D Jimenez (Georgia Tech); Sung-Hyun Son (MIT Lincoln Laboratory)</li>
<li><b>Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models</b><br />Raaz Dwivedi (University of California, Berkeley); Nhat Ho (University of California, Berkeley)*; Koulik Khamaru (University of California Berkeley); Martin Wainwright (UC Berkeley); Michael Jordan (UC Berkeley); Bin Yu (University of California, Berkeley)</li>
<li><b>Stochastic Particle-Optimization Sampling and the Non-Asymptotic Convergence Theory</b><br />Jianyi Zhang (Duke University); Ruiyi Zhang (Duke University); Lawrence Carin (Duke University); Changyou Chen (University at Buffalo)*</li>
<li><b>Dynamical Systems Theory for Causal Inference with Application to Synthetic Control Methods</b><br />Yi Ding (The University of Chicago)*; Panos Toulis (Chicago Booth School of Business)</li>
<li><b>RelatIF: Identifying Explanatory Training Samples via Relative Influence</b><br />Elnaz Barshan (Element AI)*; Marc-Etienne Brunet (University of Toronto); Gintare Karolina Dziugaite (Element AI)</li>
<li><b>Ensemble Gaussian Processes with Spectral Features for Online Interactive Learning with Scalability</b><br />Qin Lu (University of Minnesota)*; Georgios Karanikolas (University of Minnesota); Yanning Shen (University of California, Irvine); Georgios B. Giannakis (University of Minnesota)</li>
<li><b>Distributionally Robust Bayesian Quadrature Optimization</b><br />Thanh Tang Nguyen (Deakin University)*; Sunil Gupta (Deakin University, Australia); Huong Ha (Deakin University); Santu Rana (Deakin University, Australia); Svetha Venkatesh (Deakin University)</li>
<li><b>Sparse Orthogonal Variational Inference for Gaussian Processes</b><br />Jiaxin Shi (Tsinghua University)*; Michalis Titsias (DeepMind); Andriy Mnih (DeepMind)</li>
<li><b>The Sylvester Graphical Lasso (SyGlasso)</b><br />Yu Wang (University of Michigan)*; Byoungwook Jang (University of Michigan); Alfred Hero (University of Michigan)</li>
<li><b>Frequentist Regret Bounds for Randomized Least-Squares Value Iteration</b><br />Andrea Zanette (Stanford University)*; David Brandfonbrener (New York University); Emma Brunskill (Stanford University); Matteo Pirotta (Facebook AI Research); Alessandro Lazaric (FAIR)</li>
<li><b>DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate</b><br />Saeed Soori (University of Toronto)*; Konstantin Mishchenko (KAUST); Aryan Mokhtari (UT Austin); Maryam Mehri Dehnavi (University of Toronto); Mert Gurbuzbalaban (Rutgers)</li>
<li><b>Discrete Action On-Policy Learning with Action-Value Critic</b><br />Yuguang Yue (University of Texas at Austin); Yunhao Tang (Columbia University); Mingzhang Yin (University of Texas at Austin); Mingyuan Zhou (University of Texas at Austin)*</li>
<li><b>Old Dog Learns New Tricks: Randomized UCB for Bandit Problems</b><br />Sharan Vaswani (Mila, Université de Montréal)*; Abbas Mehrabian (McGill University); Audrey Durand (Université Laval); Branislav Kveton (Google Research)</li>
<li><b>Thompson Sampling for Linearly Constrained Bandits</b><br />Vidit Saxena (KTH Royal Institute of Technology, Stockholm)*; Joakim Jalden (KTH); Joseph Gonzalez (UC Berkeley)</li>
<li><b>Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles</b><br />Aditya Modi (Univ. of Michigan Ann Arbor)*; Nan Jiang (University of Illinois at Urbana-Champaign); Ambuj Tewari (University of Michigan); Satinder Singh (UMich)</li>
<li><b>FedPAQ: A Communication-Efficient Federated Learning Method with Periodic Averaging and Quantization</b><br />Amirhossein Reisizadeh (UC Santa Barbara)*; Aryan Mokhtari (UT Austin); Hamed Hassani (University of Pennsylvania); Ali Jadbabaie (Massachusetts Institute of Technology); Ramtin Pedarsani (UC Santa Barbara)</li>
<li><b>Online Learning Using Only Peer Prediction</b><br />Yang Liu (UCSC)*; Dave Helmbold ()</li>
<li><b>Deontological Ethics By Monotonicity Shape Constraints</b><br />Serena L Wang (Google)*; Maya Gupta (Google)</li>
<li><b>On Random Subsampling of Gaussian Process Regression: A Graphon-Based Analysis</b><br />Kohei Hayashi (Preferred Networks, Inc.)*; Masaaki Imaizumi (The Institute of Statistical Mathematics / RIKEN AIP); Yuichi Yoshida (NII)</li>
<li><b>Randomized Exploration in Generalized Linear Bandits</b><br />Branislav Kveton (Google Research)*; Manzil Zaheer (Google Research); Csaba Szepesvari (DeepMind/University of Alberta); Lihong Li (Google Brain); Mohammad Ghavamzadeh (Facebook AI Research); Craig Boutilier (Google Research)</li>
<li><b>Assessing Local Generalization Capability in Deep Models</b><br />Huan Wang (Salesforce Research)*; Nitish Shirish Keskar (); Caiming Xiong (Salesforce Research); Richard Socher (Salesforce)</li>
<li><b>Fast Algorithms for Computational Optimal Transport and Wasserstein Barycenter</b><br />Wenshuo Guo (UC Berkeley)*; Nhat Ho (University of California, Berkeley); Michael Jordan (UC Berkeley)</li>
<li><b>Adaptive Discretization for Evaluation of Probabilistic Cost Functions</b><br />Christoph Zimmer (Bosch Center for Artificial Intelligence)*; Danny Driess (University of Stuttgart); Mona Meister (Bosch Center for Artificial Intelligence); Nguyen-Tuong Duy (Bosch Center for AI)</li>
<li><b>Censored Quantile Regression Forest</b><br />Alexander Hanbo Li (Amazon)*; Jelena Bradic ()</li>
<li><b>Choosing the Sample with Lowest Loss makes SGD Robust</b><br />Vatsal Shah (University of Texas at Austin)*; Xiaoxia Wu (University of Texas at Austin); Sujay Sanghavi (UT Austin)</li>
<li><b>Learning with minibatch Wasserstein: asymptotic and gradient properties</b><br />Kilian Fatras (IRISA/INRIA)*; Nicolas Courty (IRISA, Universite Bretagne-Sud); Rémi Flamary (Université Côte d’Azur); Younès Zine (ENS Rennes); Remi Gribonval (INRIA)</li>
<li><b>AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC</b><br />Ruqi Zhang (Cornell University)*; A. Feder Cooper (Cornell University); Christopher De Sa (Cornell University)</li>
<li><b>On casting importance weighted autoencoder to an EM algorithm to learn deep generative models</b><br />Dongha Kim (Seoul National University); Jaesung Hwang (Seoul National University); Yongdai Kim (Seoul National University)*</li>
<li><b>Conditional Linear Regression</b><br />Diego A Calderon (UIUC NCSA); Brendan Juba (Washington University in St Louis); Sirui Li (MIT); Zongyi Li (Caltech)*; Lisa Ruan (Harvard University)</li>
<li><b>Distributionally Robust Bayesian Optimization</b><br />Johannes Kirschner (ETH Zurich)*; Ilija Bogunovic (ETH Zurich); Stefanie Jegelka (MIT); Andreas Krause (ETH Zürich)</li>
<li><b>On the optimality of kernels for high-dimensional clustering</b><br />Leena C Vankadara (University of Tübingen)*; Debarghya Ghoshdastidar (TU Munich)</li>
<li><b>Improved Regret Bounds for Projection-free Bandit Convex Optimization</b><br />Dan Garber (Technion)*; Ben Kretzu (Technion)</li>
<li><b>Variational Autoencoders and Nonlinear ICA: A Unifying Framework</b><br />Ilyes Khemakhem (UCL)*; Diederik P Kingma (Google); Ricardo Monti (UCL); Aapo Hyvarinen (INRIA & U Helsinki)</li>
<li><b>Online Learning with Continuous Variations: Dynamic Regret and Reductions</b><br />Ching-An Cheng (Georgia Institute of Technology)*; Jonathan Lee (Stanford University); Ken Goldberg (UC Berkeley); Byron Boots (University of Washington)</li>
<li><b>An Optimal Algorithm for Bandit Convex Optimization with Strongly-Convex and Smooth Loss</b><br />Shinji Ito (NEC Corporation)*</li>
<li><b>A Deep Generative Model for Fragment-Based Molecule Generation</b><br />Marco Podda (University of Pisa)*; Davide Bacciu (University of Pisa); Alessio Micheli (Universita di Pisa)</li>
<li><b>Deep Structured Mixtures of Gaussian Processes</b><br />Martin Trapp (Graz University of Technology)*; Robert Peharz (Eindhoven University of Technology); Franz Pernkopf (Graz University of Technology); Carl Edward Rasmussen (Cambridge University)</li>
<li><b>Noisy-Input Entropy Search for Efficient Robust Bayesian Optimization</b><br />Lukas Fröhlich (Bosch Center for Artificial Intelligence)*; Edgar Klenske (Bosch AI Research); Julia Vinogradska (Bosch); Christian Daniel (Robert Bosch LLC); Melanie Zeilinger (ETH Zurich)</li>
<li><b>Dependent randomized rounding for clustering and partition systems with knapsack constraints</b><br />David Harris (University of Maryland)*</li>
<li><b>Domain-Liftability of Relational Marginal Polytopes</b><br />Ondrej Kuzelka (CTU in Prague)*; Yuyi Wang (ETH Zurich)</li>
<li><b>Derivative-Free & Order-Robust Optimisation</b><br />Haitham Ammar (Huawei); Victor Gabillon (Huawei)*; Rasul Tutunov (Huawei); Michal Valko (Inria)</li>
<li><b>Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning</b><br />Yao Zhang (University of Cambridge)*; Daniel Jarrett (University of Cambridge); Mihaela van der Schaar (University of Cambridge)</li>
<li><b>Dynamic content based ranking</b><br />Seppo Virtanen (University of Cambridge)*; Mark Girolami (Imperial College)</li>
<li><b>Fairness Evaluation in Presence of Biased Noisy Labels</b><br />Riccardo Fogliato (CMU)*; Alexandra Chouldechova (CMU); Max G'Sell (CMU)</li>
<li><b>Calibrated Surrogate Maximization of Linear-fractional Utility in Binary Classification</b><br />Han Bao (The University of Tokyo / RIKEN)*; Masashi Sugiyama (RIKEN/The University of Tokyo)</li>
<li><b>Decentralized gradient methods: does topology matter?</b><br />Giovanni Neglia (Inria)*; Chuan Xu (Inria); Don Towsley (University of Massachusetts Amherst); Gianmarco Calbi (Inria)</li>
<li><b>Truly Batch Model-Free Inverse Reinforcement Learning about Multiple Intentions</b><br />Giorgia Ramponi (Politecnico di Milano)*; Amarildo Likmeta (Politecnico di Milano); Alberto Maria Metelli (Politecnico di Milano); Andrea Tirinzoni (Politecnico di Milano); Marcello Restelli (Politecnico di Milano)</li>
<li><b>Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness</b><br />Antônio H. Ribeiro (Federal University of Minas Gerais)*; Koen Tiels (Eindhoven University of Technology); Luis A. Aguirre (Federal University of Minas Gerais); Thomas Schön (Uppsala University)</li>
<li><b>Accelerated Primal-Dual Algorithms for Distributed Smooth Convex Optimization over Networks</b><br />Jinming Xu (Zhejiang University)*; Ye Tian (Purdue University); Ying Sun (Purdue University); Gesualdo Scutari (Purdue University)</li>
<li><b>Stochastic Linear Contextual Bandits with Diverse Contexts</b><br />Weiqiang Wu (London Stock Exchange); Jing Yang (Penn State University)*; Cong Shen (University of Virginia)</li>
<li><b>Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models</b><br />Benjamin J Lengerich (Carnegie Mellon University)*; Sarah Tan (Cornell University); Chun-Hao Chang (University of Toronto); Giles Hooker (Cornell University); Rich Caruana (Microsoft Research)</li>
<li><b>Balanced Off-Policy Evaluation in General Action Spaces</b><br />Arjun Sondhi (University of Washington)*; David Arbour (Adobe Research); Drew Dimmery (Facebook)</li>
<li><b>Approximate Cross-Validation in High Dimensions with Guarantees</b><br />William T Stephenson (MIT)*; Tamara Broderick (MIT)</li>
<li><b>How fine can fine-tuning be? Learning efficient language models</b><br />Evani Radiya-Dixit (Stanford University); Xin Wang (Cerebras Systems)*</li>
<li><b>Interpretable Companions for Black-Box Models</b><br />Danqing Pan (Osaka University); Tong Wang (University of Iowa); Satoshi Hara (Osaka University)*</li>
<li><b>A PTAS for the Bayesian Thresholding Bandit Problem</b><br />Yue Qin (Indiana University); Jian Peng (UIUC); Yuan Zhou (UIUC)*</li>
<li><b>Learning Rate Adaptation for Differentially Private Learning</b><br />Antti Koskela (University of Helsinki)*; Antti Honkela (University of Helsinki)</li>
<li><b>Thresholding Graph Bandits with GrAPL</b><br />Daniel LeJeune (Rice University)*; Gautam Dasarathy (Arizona State University); Richard Baraniuk (Rice University)</li>
<li><b>Bandit optimisation of functions in the Matérn kernel RKHS</b><br />David Janz (University of Cambridge)*; David R Burt (Cambridge University); Javier Gonzalez (Amazon.com)</li>
<li><b>Hypothesis Testing Interpretations and Renyi Differential Privacy</b><br />Borja Balle (Amazon); Gilles Barthe (MPI-SP and IMDEA Software Institute); Marco Gaboardi (Boston University); Justin Hsu (University of Wisconsin--Madison); Tetsuya Sato (Seikei University)*</li>
<li><b>Lipschitz Continuous Autoencoders in Application to Anomaly Detection</b><br />Young-geun Kim (Seoul National University); Yongchan Kwon (Seoul National University); Hyunwoong Chang (Texas A&M University); Myunghee Cho Paik (Seoul National University)*</li>
<li><b>Private k-Means Clustering with Stability Assumptions</b><br />Moshe Shechner (Ben-Gurion University); Or Sheffet (University of Alberta); Uri Stemmer (Ben-Gurion University)*</li>
<li><b>Momentum in Reinforcement Learning</b><br />Nino Vieillard (Google Research)*; Bruno Scherrer (Inria); Olivier Pietquin (Google Research - Brain Team); Matthieu Geist (Google Brain)</li>
<li><b>A Primal-Dual Solver for Large-Scale Tracking-by-Assignment</b><br />Stefan Haller (Heidelberg University)*; Mangal Prakash (CSBD/MPI-CBG); Lisa Hutschenreiter (Heidelberg University); Tobias Pietzsch (CSBD / MPI-CBG); Carsten Rother (University of Heidelberg); Florian Jug (CSBD/MPI-CBG); Paul Swoboda (MPI fuer Informatik, Saarbruecken); Bogdan Savchynskyy (Heidelberg University)</li>
<li><b>Precision-Recall Curves Using Information Divergence Frontiers</b><br />Josip Djolonga (Google)*; Mario Lucic (Google Brain); Marco Cuturi (Google and CREST/ENSAE); Olivier Bachem (Google Brain); Olivier Bousquet (Google); Sylvain Gelly (Google Brain)</li>
<li><b>Computing Tight Differential Privacy Guarantees Using FFT</b><br />Antti Koskela (University of Helsinki)*; Joonas Jälkö (Aalto University); Antti Honkela (University of Helsinki)</li>
<li><b>Hyperbolic Manifold Regression</b><br />Gian Maria Marconi (IIT)*; Carlo Ciliberto (Imperial College London); Lorenzo Rosasco (UniGe, MIT, IIT)</li>
<li><b>Approximate Inference with Wasserstein Gradient Flows</b><br />Charlie Frogner (CBMM, MIT)*; Tomaso Poggio (MIT)</li>
<li><b>Thresholding Bandit Problem with Both Duels and Pulls</b><br />Yichong Xu (Carnegie Mellon University)*; Xi Chen (New York University); Aarti Singh (Carnegie Mellon University); Artur Dubrawski (CMU)</li>
<li><b>GAIT: A Geometric Approach to Information Theory</b><br />Jose D Gallego Posada (Mila, Université de Montréal)*; Ankit Vani (Mila, Université de Montréal); Max Schwarzer (Mila, Université de Montréal); Simon Lacoste-Julien (Mila, Université de Montréal)</li>
<li><b>On Thompson Sampling for Smoother-than-Lipschitz Bandits</b><br />James A Grant (Lancaster University)*; David S Leslie (Lancaster University)</li>
<li><b>Safe-Bayesian Generalized Linear Regression</b><br />Rianne de Heide (CWI / Leiden University)*; Alisa Kirichenko (University of Oxford); Peter Grunwald (Centrum voor Wiskunde en Informatica); Nishant Mehta (University of Victoria)</li>
<li><b>Efficient Distributed Hessian Free Algorithm for Large-scale Empirical Risk Minimization via Accumulating Sample Strategy</b><br />Majid Jahani (Lehigh University); Xi He (Lehigh University); Chenxin Ma (Lehigh University); Aryan Mokhtari (UT Austin); Dheevatsa Mudigere (Facebook); Alejandro Ribeiro (University of Pennsylvania); Martin Takac (Lehigh University)*</li>
<li><b>Contextual Constrained Learning for Dose-Finding Clinical Trials</b><br />Hyun-Suk Lee (University of Cambridge); Cong Shen (University of Virginia)*; James Jordon (University of Oxford); Mihaela van der Schaar (University of California, Los Angeles)</li>
<li><b>Support recovery and sup-norm convergence rates for sparse pivotal estimation</b><br />Mathurin Massias (Inria); Quentin Bertrand (INRIA)*; Alexandre Gramfort (Inria); Joseph Salmon (Université de Montpellier)</li>
<li><b>Learning Entangled Single-Sample Distributions via Iterative Trimming</b><br />Hui Yuan (University of Science and Technology of China)*; Yingyu Liang (University of Wisconsin Madison)</li>
<li><b>The Quantile Snapshot Scan: Comparing Quantiles of Spatial Data from Two Snapshots in Time</b><br />Travis Moore (Oregon State University)*; Wong Weng-Keen (Oregon State University)</li>
<li><b>Statistical guarantees for local graph clustering</b><br />Wooseok Ha (UC Berkeley)*; Kimon Fountoulakis (University of Waterloo); Michael Mahoney (University of California, Berkeley)</li>
<li><b>Learning High-dimensional Gaussian Graphical Models under Total Positivity without Adjustment of Tuning Parameters</b><br />Yuhao Wang (University of Cambridge)*; Uma Roy (MIT); Caroline Uhler (MIT)</li>
<li><b>On Pruning for Score-Based Bayesian Network Structure Learning</b><br />Alvaro Henrique Chaim Correia (Eindhoven University of Technology)*; James Cussens (University of York); Cassio de Campos (Eindhoven University of Technology)</li>
<li><b>Statistical and Computational Rates in Graph Logistic Regression</b><br />Quentin Berthet (Google Brain)*; Nicolai Baldin (University of Cambridge)</li>
<li><b>Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization</b><br />Poompol Buathong (Mahidol University); David Ginsbourger (Idiap Research Institute); Tipaluck Krityakierne (Mahidol University)*</li>
<li><b>Rk-means: Fast Clustering for Relational Data</b><br />Ryan Curtin (RelationalAI)*; Benjamin Moseley (Carnegie Mellon University); Hung Ngo (RelationalAI); XuanLong Nguyen (University of Michigan); Dan Olteanu (University of Oxford); Maximilian Schleich (University of Oxford)</li>
<li><b>Statistical Estimation of the Poincaré constant and Application to Sampling Multimodal Distributions</b><br />Loucas Pillaud-Vivien (INRIA - Ecole Normale Supérieure)*</li>
<li><b>Integrals over Gaussians under Linear Domain Constraints</b><br />Alexandra Gessner (University of Tuebingen)*; Oindrila Kanjilal (Technical University of Munich); Philipp Hennig (University of Tuebingen)</li>
<li><b>Taxonomy of Dual Block-Coordinate Ascent Methods for Discrete Energy Minimization</b><br />Siddharth Tourani (Visual Learning Lab, HCI, IWR, Uni. Heidelberg)*; Alexander Shekhovtsov (Czech Technical University in Prague, Czech Republic); Carsten Rother (University of Heidelberg); Bogdan Savchynskyy (Heidelberg University)</li>
<li><b>PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures</b><br />Mathieu Carriere (Columbia University)*; Frederic Chazal (INRIA); Yuichi Ike (Fujitsu); Theo Lacombe (Inria Saclay); Martin Royer (INRIA); Yuhei Umeda (Fujitsu)</li>
<li><b>MAP Inference for Customized Determinantal Point Processes via Maximum Inner Product Search</b><br />Insu Han (KAIST)*; Jennifer Gillenwater (Google)</li>
<li><b>Why Non-myopic Bayesian Optimization is Promising and How Far Should We Look-ahead? A Study via Rollout</b><br />Xubo Yue (University of Michigan)*; Raed AL Kontar (University of Michigan)</li>
<li><b>Robust Optimisation Monte Carlo</b><br />Borislav R Ikonomov (University of Edinburgh)*; Michael U. Gutmann (University of Edinburgh)</li>
<li><b>Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis</b><br />Ryan Rogers (LinkedIn); Aaron Roth (University of Pennsylvania); Adam Smith (Boston University); Nathan Srebro (Toyota Technological Institute at Chicago); Om Dipakbhai Thakkar (Google)*; Blake E Woodworth (TTI-Chicago)</li>
<li><b>Fast Markov chain Monte Carlo algorithms via Lie groups</b><br />Steve Huntsman (BAE Systems)*</li>
<li><b>Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning</b><br />Tianyu Li (McGill University)*; Bogdan Mazoure (MILA, McGill University); Doina Precup (McGill University); Guillaume Rabusseau (Mila, Université de Montréal)</li>
<li><b>A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Differentiable Games</b><br />Waïss Azizian (Mila, University of Montreal, Ecole Normale Supérieure de Paris)*; Ioannis Mitliagkas (Mila & University of Montreal); Simon Lacoste-Julien (Mila, Université de Montréal); Gauthier Gidel (Mila, Université de Montréal)</li>
<li><b>Doubly Sparse Variational Gaussian Processes</b><br />Vincent Adam (PROWLER.io)*; Stefanos Eleftheriadis (PROWLER.io); Artem Artemev (PROWLER.io); Nicolas Durrande (PROWLER.io); James Hensman (PROWLER.io)</li>
<li><b>Online Convex Optimization with Perturbed Constraints: Optimal Rates against Stronger Benchmarks</b><br />Victor Valls (Yale University)*; George Iosifidis (Trinity College Dublin); Douglas Leith (Trinity College Dublin); Leandros Tassiulas (Yale University)</li>
<li><b>Persistence Enhanced Graph Neural Network</b><br />Qi Zhao (The Ohio State University)*; Ze Ye (Stony Brook University); Chao Chen (Stony Brook University); Yusu Wang (Ohio State University)</li>
<li><b>Feature relevance quantification in explainable AI: A causal problem</b><br />Dominik Janzing (Amazon)*; Lenon Minorics (Amazon Research, Tuebingen); Patrick Bloebaum (Amazon Research Tuebingen)</li>
<li><b>Neural Decomposition: Functional ANOVA with Variational Autoencoders</b><br />Kaspar Märtens (University of Oxford)*; Christopher Yau (University of Birmingham)</li>
<li><b>BasisVAE: Translation-invariant feature-level clustering with Variational Autoencoders</b><br />Kaspar Märtens (University of Oxford)*; Christopher Yau (University of Birmingham)</li>
<li><b>How To Backdoor Federated Learning</b><br />Eugene Bagdasaryan (Cornell University)*; Andreas Veit (); Yiqing Hua (Cornell University); Deborah Estrin (Cornell Tech, Cornell University); Vitaly Shmatikov (Cornell University)</li>
<li><b>Exploiting Categorical Structure Using Tree-Based Methods</b><br />Brian Lucena (Numeristical)*</li>
<li><b>A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments</b><br />Adam Foster (University of Oxford)*; Martin Jankowiak (Uber AI Labs); Matthew O'Meara (University of Michigan); Yee Whye Teh (University of Oxford); Tom Rainforth (University of Oxford)</li>
<li><b>Mixed Strategies for Robust Optimization of Unknown Objectives</b><br />Pier Giuseppe Sessa (ETH Zürich)*; Ilija Bogunovic (ETH Zurich); Maryam Kamgarpour (ETH Zürich); Andreas Krause (ETH Zürich)</li>
<li><b>Functional Gradient Boosting for Learning Residual-like Networks with Statistical Guarantees</b><br />Atsushi Nitanda (The University of Tokyo / RIKEN / JST PRESTO)*; Taiji Suzuki (The University of Tokyo / RIKEN)</li>
<li><b>Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity</b><br />Aaron Sidford (Stanford); Mengdi Wang (Princeton University); Lin Yang (UCLA)*; Yinyu Ye (Stanford)</li>
<li><b>Convergence Rates of Smooth Message Passing with Rounding in Entropy-Regularized MAP Inference</b><br />Jonathan Lee (Stanford University)*; Aldo Pacchiano (UC Berkeley); Michael Jordan (UC Berkeley)</li>
<li><b>Finite-Time Error Bounds for Biased Stochastic Approximation with Applications to Q-Learning</b><br />Gang Wang (University of Minnesota)*; Georgios B. Giannakis (University of Minnesota)</li>
<li><b>Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models</b><br />Theo Galy-Fajou (TU Berlin)*; Florian Wenzel (TU Kaiserslautern); Manfred Opper (TU Berlin)</li>
<li><b>Bayesian Reinforcement Learning via Deep, Sparse Sampling</b><br />Divya Grover (Chalmers University of Technology); Debabrota Basu (Chalmers University of Technology); Christos Dimitrakakis (University of Oslo / Chalmers University of Technology)*</li>
<li><b>Deterministic Decoding for Discrete Data in Variational Autoencoders</b><br />Daniil Polykovskiy (Insilico Medicine)*; Dmitry P Vetrov (Higher School of Economics)</li>
<li><b>Monotonic Gaussian Process Flows</b><br />Ivan Ustyuzhaninov (University of Tuebingen); Ieva Kazlauskaite (University of Bath)*; Carl Henrik Ek (University of Bristol); Neill Campbell (University of Bath)</li>
<li><b>Flexible distribution-free conditional predictive bands using density estimators</b><br />Rafael Izbicki (UFSCar)*; Gilson Shimizu (UFSCar); Rafael Stern (UFSCar)</li>
<li><b>Variational Integrator Networks for Physically Structured Embeddings</b><br />Steindor Saemundsson (Imperial College London)*; Alexander Terenin (Imperial College London); Katja Hofmann (Microsoft Research); Marc Deisenroth (University College London)</li>
<li><b>Black-Box Inference for Non-Linear Latent Force Models</b><br />Wil O C Ward (University of Sheffield)*; Tom Ryder (Newcastle University); Dennis Prangle (Newcastle University); Mauricio A Alvarez (University of Sheffield)</li>
<li><b>Importance Sampling via Local Sensitivity</b><br />Anant Raj (Max-Planck Institute for Intelligent Systems)*; Cameron Musco (Microsoft Research); Lester Mackey (Microsoft Research New England)</li>
<li><b>Convergence Analysis of Block Coordinate Algorithms with Determinantal Sampling</b><br />Mojmir Mutny (ETH Zurich)*; Michal Derezinski (UC Berkeley); Andreas Krause (ETH Zürich)</li>
<li><b>Bisect and Conquer: Hierarchical Clustering via Max-Uncut Bisection</b><br />Vaggos Chatziafratis (Stanford University, California)*; Grigory Yaroslavtsev (Indiana University, Bloomington); Euiwoong Lee (NYU); Konstantin Makarychev (Northwestern University); Sara Ahmadian (Google); Alessandro Epasto (Google); Mohammad Mahdian (Google)</li>
<li><b>Laplacian-Regularized Graph Bandits: Algorithms and Theoretical Analysis</b><br />Kaige Yang (University College London)*; Laura Toni (UCL); Xiaowen Dong (University of Oxford)</li>
<li><b>Enriched mixtures of generalised Gaussian process experts</b><br />Charles Gadd (Aalto University); Sara Wade (University of Edinburgh)*; Alexis Boukouvalas (Prowler.io)</li>
<li><b>Causal Bayesian Optimization</b><br />Virginia Aglietti (University of Warwick)*; Javier Gonzalez (Amazon.com); Xiaoyu Lu (University of Oxford); Andrei Paleyes (Amazon)</li>
<li><b>Linear predictor on linearly-generated data with missing values: non consistency and solutions</b><br />Marine Le Morvan (CNRS)*; Nicolas Prost (CMAP); Julie Josse (Polytechnique/INRIA); Erwan Scornet (École Polytechnique); Gael P Varoquaux (INRIA)</li>
<li><b>A Novel Confidence-Based Algorithm for Structured Bandits</b><br />Andrea Tirinzoni (Politecnico di Milano)*; Alessandro Lazaric (FAIR); Marcello Restelli (Politecnico di Milano)</li>
<li><b>Quantitative stability of optimal transport maps and linearization of the 2-Wasserstein space</b><br />Quentin Mérigot (Laboratoire de mathématiques d'Orsay, Université Paris-Sud)*; Alex Delalande (INRIA); Frederic Chazal (INRIA)</li>
<li><b>Bayesian experimental design using regularized determinantal point processes</b><br />Michal Derezinski (UC Berkeley)*; Feynman Liang (UC Berkeley); Michael Mahoney (University of California, Berkeley)</li>
<li><b>Non-exchangeable feature allocation models with sublinear growth of the feature sizes</b><br />Giuseppe Di Benedetto (University of Oxford)*; Francois Caron (Oxford); Yee Whye Teh (University of Oxford)</li>
<li><b>Calibrated Prediction with Covariate Shift via Unsupervised Domain Adaptation</b><br />Sangdon Park (University of Pennsylvania)*; Osbert Bastani (University of Pennsylvania); James Weimer (University of Pennsylvania); Insup Lee (University of Pennsylvania)</li>
<li><b>Inference of Dynamic Graph Changes for Functional Connectome</b><br />Dingjue Ji (Yale University)*; Junwei Lu; Yiliang Zhang (Yale University); Siyuan Gao (Yale University); Hongyu Zhao (Yale University)</li>
<li><b>An approximate KLD based experimental design for models with intractable likelihoods</b><br />Ziqiao Ao (Shanghai Jiao Tong University); Jinglai Li (University of Liverpool)*</li>
<li><b>Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference</b><br />Usaid Awan (Duke University); Marco Morucci (Duke University)*; Vittorio Orlandi (Duke University); Sudeepa Roy (Duke University, USA); Cynthia Rudin (Duke); Alexander Volfovsky (Duke University)</li>
<li><b>"Bring Your Own Greedy"+Max: Near-Optimal 1/2-Approximations for Submodular Knapsack</b><br />Grigory Yaroslavtsev (Indiana University, Bloomington); Samson Zhou (Carnegie Mellon University); Dmitrii Avdiukhin (Indiana University, Bloomington)*</li>
<li><b>Sample complexity bounds for localized sketching</b><br />Rakshith Sharma  Srinivasa (Georgia Institute of Technology)*; Mark Davenport (Georgia Institute of Technology); Justin Romberg (Georgia Tech)</li>
<li><b>An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays</b><br />Julian Zimmert (Google)*; Yevgeny Seldin (University of Copenhagen)</li>
<li><b>Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes</b><br />Zhaozhi Qian (University of Cambridge)*; Ahmed M.  Alaa (University of California, Los Angeles); Alexis Bellot (University of Cambridge); Mihaela van der Schaar (University of California, Los Angeles); Jem Rashbass (PHE)</li>
<li><b>Tensorized Random Projections</b><br />Beheshteh T Rakhshan (Purdue University)*; Guillaume Rabusseau (Mila, Université de Montréal)</li>
<li><b>Nonparametric Estimation in the Dynamic Bradley-Terry Model</b><br />Heejong Bong (Carnegie Mellon University)*; Wanshan Li (Carnegie Mellon University); Shamindra Shrotriya (Carnegie Mellon University); Alessandro Rinaldo (Carnegie Mellon University)</li>
<li><b>Gaussian-Smoothed Optimal Transport: Metric Structure and Statistical Efficiency</b><br />Ziv Goldfeld (Cornell University); Kristjan Greenewald (IBM Research)*</li>
<li><b>Learning in Gated Neural Networks</b><br />Ashok V Makkuva (UIUC)*; Sewoong Oh (University of Washington); Sreeram Kannan (University of Washington); Pramod Viswanath (UIUC)</li>
<li><b>Validation of Approximate Likelihood and Emulator Models for Computationally Intensive Simulations</b><br />Niccolo Dalmasso (Carnegie Mellon University)*; Ann Lee (Carnegie Mellon University); Rafael Izbicki (UFSCar); Taylor Pospisil (Google LLC); Ilmun Kim (CMU); Chieh-An Lin (University of Edinburgh)</li>
<li><b>Fenchel Lifted Networks: A Lagrange Relaxation of Neural Network Training</b><br />Fangda Gu (UC Berkeley); Armin Askari (UC Berkeley)*; Laurent El Ghaoui (UC Berkeley)</li>
<li><b>Adversarial Robustness Guarantees for Classification with Gaussian Processes</b><br />Arno Blaas (University of Oxford)*; Andrea Patane (University of Oxford); Luca Laurenti (University of Oxford); Luca Cardelli (University of Oxford); Marta Kwiatkowska (Oxford University); Stephen Roberts (Oxford)</li>
<li><b>Causal inference in degenerate systems: An impossibility result</b><br />Yue Wang (Institut des Hautes Études Scientifiques)*; Linbo Wang (University of Toronto)</li>
<li><b>ChemBO: Bayesian Optimization of Small Organic Molecules with Synthesizable Recommendations</b><br />Ksenia Korovina (Carnegie Mellon University)*; Sailun Xu (Carnegie Mellon University); Kirthevasan Kandasamy; Willie Neiswanger (Carnegie Mellon University); Barnabas Poczos (Carnegie Mellon University); Jeff Schneider; Eric Xing (Petuum Inc. and CMU)</li>
<li><b>Local Differential Privacy for Sampling</b><br />Hisham Husain (The Australian National University)*; Borja Balle (Amazon); Zac Cranko (Australian National University); Richard Nock (Data61, CSIRO)</li>
<li><b>Learning Sparse Nonparametric DAGs</b><br />Xun Zheng (Carnegie Mellon University)*; Chen Dan (Carnegie Mellon University); Bryon Aragam (University of Chicago); Pradeep Ravikumar (Carnegie Mellon University); Eric Xing (Petuum Inc. and CMU)</li>
<li><b>Minimax Rank-1 Matrix Factorization</b><br />Venkatesh Saligrama (Boston University); Alexander Olshevsky; Julien Hendrickx (University of Catholique de Louvain)*</li>
<li><b>Context Mover's Distance & Barycenters: Optimal Transport of Contexts for Building Representations</b><br />Sidak Pal Singh (EPFL)*; Andreas Hug (EPFL); Aymeric Dieuleveut (EPFL); Martin Jaggi (EPFL)</li>
<li><b>Data Generation for Neural Programming by Example</b><br />Judith Clymo (University of Leeds)*; Adria Gascon (The Alan Turing Institute); Brooks Paige (UCL & Alan Turing Institute); Nathanael Fijalkow (The Alan Turing Institute); Haik Manukian (University of California at San Diego)</li>
<li><b>An Inverse-free Truncated Rayleigh-Ritz Method for Sparse Generalized Eigenvalue Problem</b><br />Yunfeng Cai (Baidu Research); Ping Li (Baidu)*</li>
<li><b>The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits</b><br />Ronshee Chawla (The University of Texas at Austin)*; Abishek Sankararaman (The University of Texas at Austin); Ayalvadi Ganesh (University of Bristol); Sanjay Shakkottai (University of Texas at Austin)</li>
<li><b>Understanding the Effects of Batching in Online Active Learning</b><br />Kareem Amin (Google Research)*; Corinna Cortes (Google); Giulia DeSalvo (Google); Afshin Rostamizadeh (Google Research)</li>
<li><b>Adaptive multi-fidelity optimization with fast learning rates</b><br />Côme Fiegel (Ecole Normale Supérieure - Paris)*; Victor Gabillon (Huawei); Michal Valko (Inria)</li>
<li><b>On the interplay between noise and curvature and its effect on optimization and generalization</b><br />Valentin Thomas (Mila, Université de Montréal)*; Fabian Pedregosa (Google); Bart van Merriënboer (Google); Pierre-Antoine Manzagol (Google); Yoshua Bengio (Mila); Nicolas Le Roux (Google)</li>
<li><b>A Reduction from Reinforcement Learning to No-Regret Online Learning</b><br />Ching-An Cheng (Georgia Institute of Technology)*; Remi Tachet des Combes (Microsoft Research Montreal); Byron Boots (University of Washington); Geoff Gordon (Microsoft)</li>
<li><b>The Implicit Regularization of Ordinary Least Squares Ensembles</b><br />Daniel LeJeune (Rice University)*; Hamid Javadi (Rice University); Richard Baraniuk (Rice University)</li>
<li><b>Adaptive Exploration in Linear Contextual Bandit</b><br />Botao Hao (Purdue University)*; Tor Lattimore (DeepMind); Csaba Szepesvari (DeepMind/University of Alberta)</li>
<li><b>A Three Sample Hypothesis Test for Evaluating Generative Models</b><br />Casey  Meehan (University of California, San Diego)*; Kamalika Chaudhuri (University of California, San Diego); Sanjoy Dasgupta (UCSD)</li>
<li><b>Learning Ising and Potts Models with Latent Variables</b><br />Surbhi Goel (UT Austin)*</li>
<li><b>Learning piecewise Lipschitz functions in changing environments</b><br />Dravyansh Sharma (Carnegie Mellon University)*; Maria-Florina Balcan (Carnegie Mellon University); Travis Dick (University of Pennsylvania)</li>
<li><b>POPCORN: Partially Observed Prediction Constrained Reinforcement Learning</b><br />Joseph Futoma (Duke)*; Michael C Hughes (Tufts University); Finale Doshi-Velez (Harvard)</li>
<li><b>Optimal Approximation of Doubly Stochastic Matrices</b><br />Nikitas Rontsis (University of Oxford)*; Paul Goulart (University of Oxford)</li>
<li><b>The Expressive Power of a Class of Normalizing Flow Models</b><br />Zhifeng Kong (University of California San Diego)*; Kamalika Chaudhuri (University of California, San Diego)</li>
<li><b>Screening Data Points in Empirical Risk Minimization via Ellipsoidal Regions and Safe Loss Functions</b><br />Grégoire Mialon (Inria)*; Julien Mairal (INRIA); Alexandre d'Aspremont (CNRS - Ecole Normale Supérieure)</li>
<li><b>An Empirical Study of Stochastic Gradient Descent with Structured Covariance Noise</b><br />Yeming Wen (University of Toronto)*; Kevin Luk (BorealisAI); Maxime Gazeau; Guodong Zhang (University of Toronto); Harris Chan (University of Toronto); Jimmy Ba (University of Toronto)</li>
<li><b>Amortized Inference of Variational Bounds for Learning Noisy-OR</b><br />Yiming Yan (University of Southern California)*; Melissa Ailem (University of Southern California); Fei Sha (Google Research)</li>
<li><b>Gain with no Pain: Efficiency of Kernel-PCA by Nyström Sampling</b><br />Nicholas Sterge (Penn State University)*; Bharath Sriperumbudur (Penn State); Lorenzo Rosasco (UniGe, MIT, IIT); Alessandro Rudi (École Normale Supérieure)</li>
<li><b>Logistic regression with peer-group effects via inference in higher-order Ising models</b><br />Constantinos Daskalakis (MIT); Nishanth Dikkala (MIT); Ioannis Panageas (SUTD)*</li>
<li><b>An Asymptotic Rate for the LASSO Loss</b><br />Cynthia Rush (Columbia University)*</li>
<li><b>Constructing a provably adversarially-robust classifier from a high accuracy one</b><br />Grzegorz Gluch (EPFL)*; Rüdiger Urbanke (EPFL)</li>
<li><b>Distributed, partially collapsed MCMC for Bayesian Nonparametrics</b><br />Kumar Avinava Dubey (Google Research)*; Michael Zhang (Princeton University); Eric Xing (Petuum Inc. and CMU); Sinead Williamson (UT Austin/CognitiveScale)</li>
<li><b>Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free</b><br />Mingrui Zhang (Yale University)*; Lin Chen (Yale University); Aryan Mokhtari (UT Austin); Hamed Hassani (University of Pennsylvania); Amin Karbasi (Yale)</li>
<li><b>A Farewell to Arms: Sequential Reward Maximization on a Budget with a Giving Up Option</b><br />P Sharoff (University of Victoria)*; Nishant Mehta (University of Victoria); Ravi  Ganti (Walmart Labs)</li>
<li><b>Prophets, Secretaries, and Maximizing the Probability of Choosing the Best</b><br />Hossein Esfandiari (Google Research)*; MohammadTaghi Hajiaghayi (University of Maryland); Brendan Lucier (Microsoft Research New England); Michael Mitzenmacher (Harvard)</li>
<li><b>A Wasserstein Minimum Velocity Approach to Learning Unnormalized Models</b><br />Ziyu Wang (Tsinghua University)*; Shuyu Cheng (Tsinghua University); Li Yueru (Tsinghua University); Jun Zhu (Tsinghua University); Bo Zhang (Tsinghua University)</li>
<li><b>Sharp Asymptotics and Optimal Performance for Inference in Binary Models</b><br />Hossein Taheri (UC Santa Barbara)*; Ramtin Pedarsani (University of California, Santa Barbara); Christos Thrampoulidis (University of California, Santa Barbara)</li>
<li><b>A Theoretical Case Study of Structured Variational Inference for Community Detection</b><br />Mingzhang Yin (University of Texas at Austin)*; Y. X. Rachel Wang (University of Sydney); Purnamrita Sarkar (University of Texas at Austin)</li>
<li><b>Orthogonal Gradient Descent for Continual Learning</b><br />Mehrdad Farajtabar (DeepMind)*; Navid Azizan (Caltech); Alex Mott (DeepMind); Ang Li (DeepMind, Mountain View)</li>
<li><b>Hamiltonian Monte Carlo Swindles</b><br />Dan Piponi (Google); Matthew D Hoffman (Google); Pavel Sountsov (Google)*</li>
<li><b>A single algorithm for both restless and rested rotting bandits</b><br />Julien Seznec (lelivrescolaire.fr)*; Pierre Menard (Inria); Alessandro Lazaric (FAIR); Michal Valko (DeepMind)</li>
<li><b>Adversarial Robustness of Flow-Based Generative Models</b><br />Phillip  Pope (University of Maryland)*; Yogesh Balaji (University of Maryland, College Park); Soheil Feizi (University of Maryland)</li>
<li><b>The Power of Batching in Multiple Hypothesis Testing</b><br />Tijana Zrnic (University of California, Berkeley)*; Daniel Jiang (Amazon); Aaditya Ramdas (CMU); Michael Jordan (UC Berkeley)</li>
<li><b>Adversarial Risk Bounds through Sparsity based Compression</b><br />Emilio R Balda (RWTH Aachen University)*; Niklas Koep (RWTH Aachen University); Arash Behboodi (RWTH Aachen University); Rudolf Mathar (RWTH Aachen University)</li>
<li><b>Learning spectrograms with convolutional spectral kernels</b><br />Zheyang Shen (Aalto University)*; Markus Heinonen (Aalto University); Samuel Kaski (Aalto University)</li>
<li><b>Federated Heavy Hitters Discovery with Differential Privacy</b><br />Wennan Zhu (Rensselaer Polytechnic Institute)*; Peter Kairouz (Google); Brendan McMahan (Google); Haicheng Sun (Google); Wei Li (Google)</li>
<li><b>Online Batch Decision-Making with High-Dimensional Covariates</b><br />Chi-Hua Wang (Purdue University)*; Guang Cheng (Purdue University)</li>
<li><b>Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems</b><br />Osbert Bastani (University of Pennsylvania)*</li>
<li><b>Scalable Gradients for Stochastic Differential Equations</b><br />Xuechen Li (Google)*; Ting-Kam Leonard Wong (University of Toronto); Ricky T. Q. Chen (University of Toronto); David Duvenaud (University of Toronto)</li>
<li><b>Understanding the Intrinsic Robustness of Image Distributions using Conditional Generative Models</b><br />Xiao Zhang (University of Virginia)*; Jinghui Chen (University of Virginia); Quanquan Gu (University of California, Los Angeles); David Evans (University of Virginia)</li>
<li><b>Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery</b><br />Zepeng Huo (Texas A&M University)*; Arash PakBin (Texas A&M University); Xiaohan Chen (Texas A&M University); Nathan Hurley (Texa); Ye Yuan (Texas A&M University); Xiaoning  Qian (Texas A&M University); Zhangyang Wang (TAMU); Shuai Huang (University of Washington); Bobak J Mortazavi (Texas A&M University)</li>
<li><b>Learnable Bernoulli Dropout for Bayesian Deep Learning</b><br />Shahin Boluki (Texas A&M University)*; Randy Ardywibowo (Texas A&M University); Siamak  Zamani Dadaneh (Texas A&M University); Mingyuan Zhou (University of Texas at Austin); Xiaoning  Qian (Texas A&M University)</li>
<li><b>General Identification of Dynamic Treatment Regimes Under Interference</b><br />Eli Sherman (Johns Hopkins University)*; David Arbour (Adobe Research); Ilya Shpitser (Johns Hopkins University)</li>
<li><b>Gaussian Sketching yields a J-L Lemma in RKHS</b><br />Samory Kpotufe (Columbia University)*; Bharath Sriperumbudur (Penn State)</li>
<li><b>Wasserstein Smoothing: Certified Robustness against Wasserstein Adversarial Attacks</b><br />Alexander J Levine (University of Maryland)*; Soheil Feizi (University of Maryland)</li>
<li><b>Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning</b><br />Ming Yin (UC Santa Barbara)*; Yu-Xiang Wang (UC Santa Barbara)</li>
<li><b>Learning Dynamic Hierarchical Topic Graph with Graph Convolutional Network for Document Classification</b><br />Zhengjue Wang (Xidian University); Chaojie Wang (Xidian University); Hao Zhang (Duke University)*; Zhibin Duan (Xidian University); Mingyuan Zhou (University of Texas at Austin); Bo Chen (Xidian University)</li>
<li><b>Differentiable Causal Backdoor Discovery</b><br />Limor Gultchin (University of Oxford)*; Matt J Kusner (University College London); Varun Kanade (University of Oxford); Ricardo Silva (University College London)</li>
<li><b>Stochastic Recursive Variance-Reduced Cubic Regularization Methods</b><br />Dongruo Zhou (UCLA); Quanquan Gu (University of California, Los Angeles)*</li>
<li><b>Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer</b><br />Yanshuai Cao (BorealisAI)*; Peng Xu (Borealis AI)</li>
<li><b>On the Completeness of Causal Discovery in the Presence of Latent Confounding with Tiered Background Knowledge</b><br />Bryan Andrews (University of Pittsburgh)*</li>
<li><b>One Sample Stochastic Frank-Wolfe</b><br />Mingrui Zhang (Yale University)*; Zebang Shen (University of Pennsylvania); Aryan Mokhtari (UT Austin); Hamed Hassani (University of Pennsylvania); Amin Karbasi (Yale)</li>
<li><b>Convex Geometry of Two-Layer ReLU Networks: Implicit Autoencoding and Interpretable Models</b><br />Tolga Ergen (Stanford University)*; Mert Pilanci (Stanford)</li>
<li><b>A Robust Univariate Mean Estimator is All You Need</b><br />Adarsh Prasad (Carnegie Mellon University)*; Sivaraman Balakrishnan (CMU); Pradeep Ravikumar (Carnegie Mellon University)</li>
<li><b>Patient-Specific Effects of Medication Using Latent Force Models with Gaussian Processes</b><br />Li-Fang Cheng (Princeton University); Bianca M Dumitrascu (Princeton University); Michael Zhang (Princeton University); Corey Chivers (University of Pennsylvania); Michael Draugelis (University of Pennsylvania); Kai Li; Barbara Engelhardt (Princeton University)*</li>
<li><b>Robust Variational Autoencoders for Outlier Detection and Repair of Mixed-Type Data</b><br />Simao Eduardo (University of Edinburgh)*; Alfredo Nazabal (The Alan Turing Institute); Christopher K. I. Williams (University of Edinburgh); Charles Sutton (Google)</li>
<li><b>Error bounds in estimating the out-of-sample prediction error using leave-one-out cross validation in high-dimensions</b><br />Kamiar Rahnama Rad (Baruch College)*; Wenda Zhou (Columbia University); Arian Maleki (Columbia University)</li>
<li><b>A Diversity-aware Model for Majority Vote Ensemble Accuracy</b><br />Bob Durrant  (University of Waikato); Nick Lim (University of Waikato)*</li>
<li><b>Scaling up Kernel Ridge Regression via Locality Sensitive Hashing</b><br />Amir Zandieh (EPFL)*; Navid Nouri (EPFL); Ameya Velingker (Google); Michael Kapralov (EPFL); Ilya Razenshteyn (Microsoft Research)</li>
<li><b>Ordering-Based Causal Structure Learning in the Presence of Latent Variables</b><br />Daniel Bernstein (MIT)*; Basil N. Saeed (Massachusetts Institute of Technology); Chandler Squires (Massachusetts Institute of Technology); Caroline Uhler (MIT)</li>
<li><b>Budget Learning via Bracketing</b><br />Durmus Alp Emre Acar (Boston University); Aditya Gangrade (Boston University)*; Venkatesh Saligrama (Boston University)</li>
<li><b>Optimal Algorithms for Multiplayer Multi-Armed Bandits</b><br />Po-An Wang (KTH Royal Institute of Technology)*; Alexandre Proutiere (KTH Royal Institute of Technology); Kaito Ariu (KTH); Yassir Jedra (KTH); Alessio Russo (KTH)</li>
<li><b>AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning</b><br />Rizal Fathony (Carnegie Mellon University)*; Zico Kolter (Carnegie Mellon University)</li>
<li><b>Optimal Deterministic Coresets for Ridge Regression</b><br />Praneeth Kacham (CMU)*; David Woodruff (Carnegie Mellon University)</li>
<li><b>Expressiveness and Learning of Hidden Quantum Markov Models</b><br />Sandesh M Adhikary (University of Washington)*; Siddarth Srinivasan (Georgia Institute of Technology); Geoff Gordon (Microsoft); Byron Boots (University of Washington)</li>
<li><b>Solving the Robust Matrix Completion Problem via a System of Nonlinear Equations</b><br />Yunfeng Cai (Baidu Research)*; Ping Li (Baidu)</li>
<li><b>Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation</b><br />Shuhang Chen (University of Florida)*; Adithya Devraj (University of Florida); Ana Busic (INRIA); Sean P Meyn (University of Florida)</li>
<li><b>Stochastic Neural Network with Kronecker Flow</b><br />Chin-Wei Huang (MILA)*; Ahmed Touati (MILA); Pascal Vincent (Facebook FAIR & Mila - Université de Montréal); Gintare Karolina Dziugaite (Element AI); Alexandre Lacoste; Aaron Courville (MILA, Université de Montréal)</li>
<li><b>Fair Correlation Clustering</b><br />Sara Ahmadian (Google); Alessandro Epasto (Google)*; Ravi Kumar (Google); Mohammad Mahdian (Google)</li>
<li><b>Towards Competitive N-gram Smoothing</b><br />Moein Falahatgar (UC San Diego); Mesrob I Ohannessian (University of Illinois at Chicago)*; Alon Orlitsky (UC San Diego); Venkatadheeraj Pichapati (UCSD)</li>
<li><b>Multi-level Gaussian Graphical Models Conditional on Covariates</b><br />Gi Bum Kim (Carnegie Mellon University); Seyoung Kim*</li>
<li><b>Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components</b><br />Christian U Carmona (University of Oxford)*; Geoff Nicholls (University of Oxford)</li>
<li><b>Invertible Generative Modeling using Linear Rational Splines</b><br />Hadi Mohaghegh Dolatabadi (University of Melbourne)*; Sarah Erfani (University of Melbourne); Christopher Leckie (University of Melbourne)</li>
<li><b>LdSM: Logarithm-depth Streaming Multi-label Decision Trees</b><br />Maryam Majzoubi (NYU Tandon)*; Anna Choromanska (NYU)</li>
<li><b>Prior-aware Composition Inference for Spectral Topic Models</b><br />Moontae Lee (University of Illinois at Chicago)*; David Bindel (Cornell University); David Mimno (Cornell University)</li>
<li><b>Variational Optimization on Lie Groups, with Examples of Leading (Generalized) Eigenvalue Problems</b><br />Molei Tao (Georgia Institute of Technology)*; Tomoki Ohsawa (University of Texas at Dallas)</li>
<li><b>Best-item Learning in Random Utility Models with Subset Choices</b><br />Aadirupa Saha (Indian Institute of Science (IISc), Bangalore)*; Aditya Gopalan (Indian Institute of Science (IISc), Bangalore)</li>
<li><b>Regularized Autoencoders via Relaxed Injective Probability Flow</b><br />Abhishek Kumar (Google)*; Ben Poole (Google Brain); Kevin Murphy (Google)</li>
<li><b>Stochastic Variance-Reduced Algorithms for PCA with Arbitrary Mini-Batch Sizes</b><br />Cheolmin Kim (Northwestern University)*; Diego Klabjan (Northwestern University)</li>
<li><b>Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks</b><br />Mingchen Li (University of California, Riverside); Mahdi Soltanolkotabi (USC); Samet Oymak (University of California, Riverside)*</li>
<li><b>Scalable Nonparametric Factorization for High-Order Interaction Events</b><br />Zhimeng Pan (University of Utah); Zheng Wang (University of Utah); Shandian Zhe (University of Utah)*</li>
<li><b>Gaussianization Flows</b><br />Chenlin Meng (Stanford University)*; Yang Song (Stanford University); Jiaming Song (Stanford); Stefano  Ermon (Stanford University)</li>
<li><b>Adaptive, Distribution-Free Prediction Intervals for Deep Networks</b><br />Danijel Kivaranovic (University of Vienna)*; Kory D. Johnson (Vienna University of Economics and Business); Hannes Leeb (University of Vienna)</li>
<li><b>A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms</b><br />Philip Amortila (McGill University)*; Doina Precup (McGill University); Prakash Panangaden; Marc G. Bellemare (Google Brain)</li>
<li><b>Automatic Differentiation of Sketched Regression</b><br />Hang  Liao (CMU); Barak Pearlmutter (Maynooth University); Vamsi K Potluru (JP Morgan AI Research)*; David Woodruff (Carnegie Mellon University)</li>
<li><b>Sublinear Optimal Policy Value Estimation in Contextual Bandits</b><br />Weihao Kong (University of Washington)*; Emma Brunskill (Stanford University); Gregory Valiant (Stanford University)</li>
<li><b>Budget-Constrained Bandits over General Cost and Reward Distributions</b><br />Semih Cayci (The Ohio State University)*; Atilla Eryilmaz; R Srikant (UIUC)</li>
<li><b>Measuring Mutual Information Between All Pairs of Variables in Subquadratic Complexity</b><br />Mohsen Ferdosi (Carnegie Mellon University)*; Arash Gholamidavoodi (Carnegie Mellon University); Hosein Mohimani (Carnegie Mellon University)</li>
<li><b>Online Continuous DR-Submodular Maximization with Long-Term Budget Constraints</b><br />Omid Sadeghi (University of Washington)*; Maryam Fazel (University of Washington)</li>
<li><b>Prediction Focused Topic Models via Feature Selection</b><br />Jason Ren (Harvard University)*; Russell Kunes (Columbia University); Finale Doshi-Velez (Harvard)</li>
<li><b>Accelerated Factored Gradient Descent for Low-Rank Matrix Factorization</b><br />Dongruo Zhou (UCLA); Yuan Cao (UCLA); Quanquan Gu (University of California, Los Angeles)*</li>
<li><b>Structured Conditional Continuous Normalizing Flows for Efficient Amortized Inference in Graphical Models</b><br />Christian Weilbach (University of British Columbia)*; Boyan Beronov (University of British Columbia); Frank Wood (University of British Columbia); William S G Harvey (University of British Columbia)</li>
<li><b>Graph Coarsening with Preserved Spectral Properties</b><br />Yu Jin (University of Maryland, College Park)*; Andreas Loukas (EPFL); Joseph JaJa (University of Maryland, College Park)</li>
<li><b>A Theoretical and Practical Framework for Regression and Classification from Truncated Samples</b><br />Andrew Ilyas (MIT)*; Emmanouil Zampetakis (Massachusetts Institute of Technology); Constantinos  Daskalakis (MIT)</li>
<li><b>Permutation Invariant Graph Generation via Score-Based Generative Modeling</b><br />Chenhao Niu (Tsinghua University)*; Yang Song (Stanford University); Jiaming Song (Stanford); Shengjia Zhao (Stanford University); Aditya Grover (Stanford University); Stefano  Ermon (Stanford University)</li>
<li><b>Finite-Time Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation</b><br />Jun Sun (Zhejiang University); Gang Wang (University of Minnesota)*; Georgios B. Giannakis (University of Minnesota); Qinmin Yang (Zhejiang University); Zaiyue Yang (Southern University of Science and Technology)</li>
<li><b>Multi-attribute Bayesian optimization with interactive preference learning</b><br />Raul Astudillo (Cornell University)*; Peter Frazier (Cornell University)</li>
<li><b>On the Sample Complexity of Learning Sum-Product Networks</b><br />Ishaq Aden-Ali (McMaster University)*; Hassan Ashtiani (McMaster University)</li>
<li><b>Tighter Theory for Local SGD on Identical and Heterogeneous Data</b><br />Ahmed Khaled Ragab Bayoumi (Cairo University)*; Konstantin Mishchenko (KAUST); Peter Richtarik (KAUST)</li>
<li><b>Approximate Cross-validation: Guarantees for Model Assessment and Selection</b><br />Ashia Wilson (UC Berkeley)*; Maximilian  Kasy (Harvard University); Lester Mackey (Microsoft Research New England)</li>
<li><b>On Minimax Optimality of GANs for Robust Mean Estimation</b><br />Kaiwen Wu (University of Waterloo)*; Gavin Weiguang Ding (Borealis AI); Ruitong Huang (Borealis AI); Yaoliang Yu (University of Waterloo)</li>
<li><b>Auditing ML Models for Individual Bias and Unfairness</b><br />Songkai Xue (University of Michigan)*; Mikhail Yurochkin (IBM Research, MIT-IBM Watson AI Lab); Yuekai Sun (University of Michigan)</li>
<li><b>Stein Variational Inference for Discrete Distributions</b><br />Jun Han (Dartmouth College)*; Fan Ding (Beihang University); Xianglong Liu (Beihang University); Lorenzo Torresani (Dartmouth College & Facebook AI); Jian Peng (UIUC); Qiang Liu (UT Austin)</li>
<li><b>Revisiting Stochastic Extragradient</b><br />Konstantin Mishchenko (KAUST)*; Dmitry Kovalev (KAUST); Egor Shulgin (Moscow Institute of Physics and Technology); Peter Richtarik (KAUST); Yura Malitsky (EPFL)</li>
<li><b>A Framework for Sample Efficient Interval Estimation with Control Variates</b><br />Shengjia Zhao (Stanford University)*; Christopher Yeh (Stanford University); Stefano  Ermon (Stanford University)</li>
<li><b>Nonmyopic Gaussian Process Optimization with Macro-Actions</b><br />Dmitrii Kharkovskii (National University of Singapore); Chun Kai Ling (Carnegie Mellon University); Bryan Kian Hsiang Low (National University of Singapore)*</li>
</ul>
